{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$ \\newcommand{\\IR}{\\mathbb R}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 4. Constructing kernels\n",
    "\n",
    "$K_1,K_2: \\IR^n\\times \\IR^n\\to \\IR, K_3:\\IR^d\\times \\IR^d\\to \\IR$ are kernels, $a\\geq 0$ a nonnegative real number, $f:\\IR^n \\to \\IR$, $\\phi:\\IR^n\\to \\IR^d$ are functions and $p(x)\\in \\IR[x]$ is a polynomial with nonnegative coefficients."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (a)\n",
    "\n",
    "$$ K(x,z) = K_1(x,z)+K_2(x,z) $$\n",
    "is a kernel because the sum of two PSD matrices is PSD."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (b)\n",
    "$$ K(x,z) = K_1(x,z)-K_2(x,z)$$\n",
    "is not necessarily a kernel. Counterexample: $K_1(x,z)=0$ und $K_2(x,z)=1$ for all $x,z$ are kernels, but $K(x,z)=-1$ is obviously not."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (c)\n",
    "\n",
    "$$ K(x,z)=aK_1(x,z)$$\n",
    "is a kernel, because the set of PSD matrices is closed under multiplication with positive real numbers."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (d)\n",
    "\n",
    "$$ K(x,z)=-aK_1(x,z)$$\n",
    "would only be a kernel if $a\\leq 0$ or $K_1=0$. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (e)\n",
    "$$ K(x,z) = K_1(x,z)K_2(x,z) $$\n",
    "\n",
    "is a kernel. By Mercer's theorem we can prove this by showing that for all PSD $m\\times m$-matrices $A,B$ the Hadamard product $ A \\circ B := (a_{ij}b_{ij})$, i.e. the $m\\times m$-matrix obtained by multiplying $A$ and $B$ componentwise, is also PSD."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First note that for $v,w\\in \\IR^m$ we have \n",
    "$$\\begin{align*} \n",
    "(vv^T) \\circ (ww^T) &= ((v_iv_j)) \\circ ((w_iw_j)) \\\\\n",
    "&= (v_iw_iv_jw_j) \\\\\n",
    "&= ((v_iw_i)) ((v_jw_j))^T \\\\\n",
    "&= (v\\circ w) (v\\circ w)^T,\n",
    "\\end{align*}$$\n",
    "i.e. the Hadamard product of rank one PSD matrices is PSD."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now recall that by the spectral theorem every PSD matrix has an orthonormal basis of eigenvectors and that all eigenvalues of a PSD matrix are nonnegative real numbers. \n",
    "\n",
    "This means in particular that there are $\\mu_k,\\lambda_k \\geq 0$ and $v_k,w_k \\in \\IR^m$ such that\n",
    "$$\\begin{align*}\n",
    "A &= \\sum_{k=1}^m \\mu_k v_kv_k^T,\\\\\n",
    "B &= \\sum_{k=1}^m \\lambda_k w_kw_k^T.\n",
    "\\end{align*}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then by bilinearity of the Hadamard product we conclude that\n",
    "$$ \\begin{align*}\n",
    "A\\circ B &= \\left(\\sum_{k=1}^m \\mu_k v_kv_k^T\\right)\\circ \\left(\\sum_{k=1}^m \\lambda_k w_kw_k^T \\right)\\\\\n",
    "&= \\sum_{1\\leq k,l \\leq m} \\mu_k\\lambda_l \\left( (v_kv_k^T) \\circ (w_lw_l^T)\\right)\\\\\n",
    "&= \\sum_{1\\leq k,l \\leq m} \\mu_k\\lambda_l \\left( (v_k\\circ w_l) (v_k \\circ w_l)^T\\right)\n",
    "\\end{align*}$$\n",
    "is PSD."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (f)\n",
    "\n",
    "$$ K(x,z) = f(x)f(z)$$\n",
    "is a kernel by Mercer's theorem, because if $x_1,\\ldots, x_m\\in \\IR^n$ then the matrix \n",
    "$$ (K(x_i,x_j)) = (f(x_i)f(x_j)) = \\begin{pmatrix}f(x_1) \\\\ \\vdots \\\\ f(x_m)\\end{pmatrix}\\begin{pmatrix}f(x_1) & \\cdots & f(x_m)\\end{pmatrix}$$\n",
    "is PSD."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (g)\n",
    "$$ K(x,z) = K_3(\\phi(x),\\phi(z))$$\n",
    "is a kernel, because for $x_1,\\ldots, x_m\\in \\IR^n$ the matrix\n",
    "$$ (K(x_i,x_j)) = (K_3(\\phi(x_i),\\phi(x_j))$$\n",
    "is PSD by Mercer's theorem applied to $K_3$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## (h)\n",
    "$$ K(x,z) = p(K_1(x,z))$$\n",
    "is a kernel by (a),(b) and (e) because $p$ only has nonnegative coefficients."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
